Triangular Distribution (triang)#

The triangular distribution is a simple bounded continuous distribution specified by a minimum (a), a most likely value (mode) (m), and a maximum (b).

It’s popular in simulation and decision analysis when you can elicit ((a,m,b)) from domain experts but don’t have enough data (or justification) for a richer family.

Throughout this notebook we use the non-degenerate case (a < m < b).

Notebook roadmap#

  1. Title & classification

  2. Intuition & motivation

  3. Formal definition (PDF/CDF)

  4. Moments & properties

  5. Parameter interpretation

  6. Derivations ((\mathbb{E}[X]), (\mathrm{Var}(X)), likelihood)

  7. Sampling & simulation (NumPy-only)

  8. Visualization (PDF, CDF, Monte Carlo)

  9. SciPy integration (scipy.stats.triang)

  10. Statistical use cases

  11. Pitfalls

  12. Summary

import math

import numpy as np

import plotly
import plotly.express as px
import plotly.graph_objects as go
import os
import plotly.io as pio

import scipy
from scipy import optimize
from scipy.stats import triang as triang_dist
from scipy.stats import norm

pio.templates.default = "plotly_white"
pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook")

SEED = 42
rng = np.random.default_rng(SEED)

np.set_printoptions(precision=6, suppress=True)

# Record versions for reproducibility (useful when numerical details matter).
VERSIONS = {"numpy": np.__version__, "scipy": scipy.__version__, "plotly": plotly.__version__}
VERSIONS
{'numpy': '1.26.2', 'scipy': '1.15.0', 'plotly': '6.5.2'}

Prerequisites & notation#

Prerequisites

  • comfort with basic calculus (piecewise integrals)

  • basic probability (PDF/CDF, expectation)

Two parameterizations

  1. Endpoint + mode (used in this notebook):

  • (a): lower bound

  • (m): mode (most likely value)

  • (b): upper bound

with (a < m < b) and support (x\in[a,b]).

  1. SciPy (scipy.stats.triang) uses:

  • shape (c\in(0,1)) = mode location as a fraction of the interval

  • loc (\in\mathbb{R})

  • scale (>0)

Mapping: [ \text{loc}=a,\quad \text{scale}=b-a,\quad c=\frac{m-a}{b-a},\quad m=a+c,(b-a). ]

1) Title & Classification#

  • Name: triang (Triangular distribution)

  • Type: continuous

  • Support: (x \in [a,b])

  • Parameter space (endpoint + mode form):

    • (a < b)

    • (a < m < b)

SciPy parameter space:

  • (c \in (0,1)), loc (\in\mathbb{R}), scale (>0)

  • support is (x\in[\text{loc},\text{loc}+\text{scale}])

2) Intuition & Motivation#

What it models#

A triangular distribution encodes a bounded uncertainty with a single most-likely value:

  • you believe values cannot go below (a) or above (b)

  • values near (m) are most plausible

  • plausibility changes linearly as you move away from (m)

Typical real-world use cases#

  • Project planning / PERT-style estimation: durations with (min, most likely, max)

  • Risk analysis: costs, demand, supply with hard bounds and a best guess

  • Monte Carlo simulation: quick, interpretable priors for bounded variables

  • Measurement / rounding uncertainty: bounded error with a most likely offset

Relations to other distributions#

  • Uniform: if you only know ([a,b]) but not a “most likely” value, the uniform is a natural baseline.

  • Symmetric triangular: when (m=(a+b)/2), the distribution is symmetric (skewness 0).

  • Irwin–Hall (sum of uniforms): if (U_1,U_2\stackrel{iid}{\sim}\text{Unif}(0,1)), then (U_1+U_2) has a symmetric triangular density on ([0,2]).

  • Beta / PERT: Beta-PERT is a smoother alternative that also uses (min, mode, max) but yields differentiable densities.

  • Piecewise-Beta view: the left side behaves like a scaled (\text{Beta}(2,1)), the right side like a scaled (\text{Beta}(1,2)).

3) Formal Definition#

Let (X\sim\text{Triang}(a,m,b)) with (a<m<b).

PDF#

The density is piecewise linear:

[ f(x\mid a,m,b) = \begin{cases} \dfrac{2(x-a)}{(b-a)(m-a)}, & a\le x\le m,\[6pt] \dfrac{2(b-x)}{(b-a)(b-m)}, & m< x\le b,\[6pt] 0, & \text{otherwise.} \end{cases} ]

The maximum density (at (x=m)) is always (;\frac{2}{b-a};) because the graph is a triangle with base (b-a) and area 1.

CDF#

[ F(x\mid a,m,b)= \begin{cases} 0, & x<a,\[6pt] \dfrac{(x-a)^2}{(b-a)(m-a)}, & a\le x\le m,\[8pt] 1-\dfrac{(b-x)^2}{(b-a)(b-m)}, & m< x\le b,\[8pt] 1, & x>b. \end{cases} ]

We’ll implement these directly in NumPy and compare to scipy.stats.triang.

def _validate_triang_params(a: float, m: float, b: float) -> None:
    if not (np.isfinite(a) and np.isfinite(m) and np.isfinite(b)):
        raise ValueError("Parameters must be finite.")
    if not (a < m < b):
        raise ValueError("Require a < m < b (non-degenerate triangular distribution).")


def triang_params_to_scipy(a: float, m: float, b: float) -> tuple[float, float, float]:
    '''Map (a, m, b) -> (c, loc, scale) for scipy.stats.triang.'''
    _validate_triang_params(a, m, b)
    scale = b - a
    c = (m - a) / scale
    return float(c), float(a), float(scale)


def scipy_params_to_triang(c: float, loc: float, scale: float) -> tuple[float, float, float]:
    '''Map (c, loc, scale) from scipy.stats.triang -> (a, m, b).'''
    if not (np.isfinite(c) and np.isfinite(loc) and np.isfinite(scale)):
        raise ValueError("Parameters must be finite.")
    if not (0.0 < c < 1.0):
        raise ValueError("Require 0 < c < 1 for a non-degenerate mode inside the interval.")
    if not (scale > 0.0):
        raise ValueError("Require scale > 0.")
    a = loc
    b = loc + scale
    m = loc + c * scale
    return float(a), float(m), float(b)


def triang_pdf(x: np.ndarray, a: float, m: float, b: float) -> np.ndarray:
    '''PDF of Triang(a, m, b) evaluated at x (NumPy-only).'''
    _validate_triang_params(a, m, b)
    x = np.asarray(x, dtype=float)

    pdf = np.zeros_like(x, dtype=float)
    left = (x >= a) & (x <= m)
    right = (x > m) & (x <= b)

    pdf[left] = 2.0 * (x[left] - a) / ((b - a) * (m - a))
    pdf[right] = 2.0 * (b - x[right]) / ((b - a) * (b - m))
    return pdf


def triang_cdf(x: np.ndarray, a: float, m: float, b: float) -> np.ndarray:
    '''CDF of Triang(a, m, b) evaluated at x (NumPy-only).'''
    _validate_triang_params(a, m, b)
    x = np.asarray(x, dtype=float)

    cdf = np.zeros_like(x, dtype=float)

    left = (x >= a) & (x <= m)
    right = (x > m) & (x <= b)
    above = x > b

    cdf[left] = (x[left] - a) ** 2 / ((b - a) * (m - a))
    cdf[right] = 1.0 - (b - x[right]) ** 2 / ((b - a) * (b - m))
    cdf[above] = 1.0
    return cdf


def triang_ppf(u: np.ndarray, a: float, m: float, b: float) -> np.ndarray:
    '''Inverse CDF (percent point function) of Triang(a, m, b) (NumPy-only).'''
    _validate_triang_params(a, m, b)
    u = np.asarray(u, dtype=float)
    if np.any((u < 0.0) | (u > 1.0)):
        raise ValueError("u must lie in [0,1].")

    p = (m - a) / (b - a)  # F(m)

    x = np.empty_like(u, dtype=float)
    left = u < p
    right = ~left

    x[left] = a + np.sqrt(u[left] * (b - a) * (m - a))
    x[right] = b - np.sqrt((1.0 - u[right]) * (b - a) * (b - m))
    return x


def triang_rvs_numpy(size: int, a: float, m: float, b: float, rng: np.random.Generator) -> np.ndarray:
    '''Random variates from Triang(a, m, b) using inverse transform sampling (NumPy-only).'''
    u = rng.random(size)
    return triang_ppf(u, a, m, b)


def triang_mean(a: float, m: float, b: float) -> float:
    _validate_triang_params(a, m, b)
    return float((a + m + b) / 3.0)


def triang_second_moment(a: float, m: float, b: float) -> float:
    _validate_triang_params(a, m, b)
    return float((a * a + b * b + m * m + a * b + a * m + b * m) / 6.0)


def triang_variance(a: float, m: float, b: float) -> float:
    _validate_triang_params(a, m, b)
    return float((a * a + b * b + m * m - a * b - a * m - b * m) / 18.0)


def triang_skewness(a: float, m: float, b: float) -> float:
    _validate_triang_params(a, m, b)
    delta = a * a + b * b + m * m - a * b - a * m - b * m
    num = math.sqrt(2.0) * (a + b - 2.0 * m) * (2.0 * a - b - m) * (a - 2.0 * b + m)
    den = 5.0 * (delta ** 1.5)
    return float(num / den)


def triang_excess_kurtosis() -> float:
    # Property of the triangular family: constant excess kurtosis.
    return float(-3.0 / 5.0)


def triang_entropy(a: float, b: float) -> float:
    '''Differential entropy in nats; depends only on the interval length (b-a).'''
    if not (np.isfinite(a) and np.isfinite(b)):
        raise ValueError("Parameters must be finite.")
    if not (b > a):
        raise ValueError("Require b > a.")
    return float(0.5 + math.log((b - a) / 2.0))


def triang_mgf(t: np.ndarray, a: float, m: float, b: float) -> np.ndarray:
    '''Moment generating function M(t)=E[e^{tX}] (NumPy-only).

    Closed form for t != 0; uses a 2nd-order Taylor expansion for small |t| to avoid cancellation.
    '''

    _validate_triang_params(a, m, b)
    t = np.asarray(t, dtype=float)

    out = np.empty_like(t, dtype=float)
    small = np.abs(t) < 1e-6

    if np.any(small):
        mu = triang_mean(a, m, b)
        ex2 = triang_second_moment(a, m, b)
        ts = t[small]
        out[small] = 1.0 + mu * ts + 0.5 * ex2 * (ts**2)

    if np.any(~small):
        tt = t[~small]
        num = 2.0 * ((b - m) * np.exp(tt * a) + (m - a) * np.exp(tt * b) - (b - a) * np.exp(tt * m))
        den = (b - a) * (b - m) * (m - a) * (tt**2)
        out[~small] = num / den

    return out


def triang_cf(omega: np.ndarray, a: float, m: float, b: float) -> np.ndarray:
    '''Characteristic function φ(ω)=E[e^{i ω X}] (NumPy-only).'''

    _validate_triang_params(a, m, b)
    omega = np.asarray(omega, dtype=float)

    out = np.empty_like(omega, dtype=complex)
    small = np.abs(omega) < 1e-6

    if np.any(small):
        mu = triang_mean(a, m, b)
        ex2 = triang_second_moment(a, m, b)
        w = omega[small]
        out[small] = 1.0 + 1j * mu * w - 0.5 * ex2 * (w**2)

    if np.any(~small):
        w = omega[~small]
        num = 2.0 * ((b - m) * np.exp(1j * w * a) + (m - a) * np.exp(1j * w * b) - (b - a) * np.exp(1j * w * m))
        den = (b - a) * (b - m) * (m - a) * ((1j * w) ** 2)
        out[~small] = num / den

    return out


def triang_loglik(a: float, m: float, b: float, x: np.ndarray) -> float:
    '''Log-likelihood for i.i.d. observations x under Triang(a, m, b).'''

    _validate_triang_params(a, m, b)
    x = np.asarray(x, dtype=float)

    # Strict interior support avoids log(0).
    if np.any((x <= a) | (x >= b)):
        return -np.inf

    left = x <= m
    right = ~left

    n = x.size
    n_left = int(left.sum())
    n_right = n - n_left

    ll = n * math.log(2.0) - n * math.log(b - a)
    ll += float(np.sum(np.log(x[left] - a)) - n_left * math.log(m - a))
    ll += float(np.sum(np.log(b - x[right])) - n_right * math.log(b - m))

    return float(ll)
# Sanity checks: compare our NumPy formulas to SciPy.

a, m, b = -1.0, 0.3, 2.0
c, loc, scale = triang_params_to_scipy(a, m, b)
rv = triang_dist(c, loc=loc, scale=scale)

x = np.linspace(a - 0.5, b + 0.5, 800)

pdf_np = triang_pdf(x, a, m, b)
cdf_np = triang_cdf(x, a, m, b)

pdf_sp = rv.pdf(x)
cdf_sp = rv.cdf(x)

print("pdf max abs diff:", float(np.max(np.abs(pdf_np - pdf_sp))))
print("cdf max abs diff:", float(np.max(np.abs(cdf_np - cdf_sp))))

mu_np = triang_mean(a, m, b)
var_np = triang_variance(a, m, b)
skew_np = triang_skewness(a, m, b)
kurt_np = triang_excess_kurtosis()

mu_sp, var_sp, skew_sp, kurt_sp = rv.stats(moments="mvsk")

print("mean   (np, scipy):", mu_np, float(mu_sp))
print("var    (np, scipy):", var_np, float(var_sp))
print("skew   (np, scipy):", skew_np, float(skew_sp))
print("kurt   (np, scipy):", kurt_np, float(kurt_sp))

print("entropy (np, scipy):", triang_entropy(a, b), float(rv.entropy()))
pdf max abs diff: 2.220446049250313e-16
cdf max abs diff: 2.220446049250313e-16
mean   (np, scipy): 0.43333333333333335 0.43333333333333335
var    (np, scipy): 0.37722222222222224 0.37722222222222224
skew   (np, scipy): 0.1292309797579062 0.12923097975790612
kurt   (np, scipy): -0.6 -0.6
entropy (np, scipy): 0.9054651081081644 0.9054651081081645

4) Moments & Properties#

Because the support is bounded, all moments exist and the MGF exists for all real (t).

Mean and variance#

[ \mathbb{E}[X] = \frac{a+m+b}{3}, \qquad \mathrm{Var}(X) = \frac{a^2+b^2+m^2-ab-am-bm}{18}. ]

A convenient second moment (useful for Taylor expansions): [ \mathbb{E}[X^2] = \frac{a^2+b^2+m^2+ab+am+bm}{6}. ]

Skewness and kurtosis#

Skewness (third standardized moment) is: [ \gamma_1 = \frac{\sqrt{2},(a+b-2m),(2a-b-m),(a-2b+m)}{5,\big(a^2+b^2+m^2-ab-am-bm\big)^{3/2}}. ]

Excess kurtosis is constant for the whole family: [ \gamma_2 = -\frac{3}{5}. ]

MGF and characteristic function#

For (t\neq 0), the MGF has a compact closed form: [ M(t)=\mathbb{E}[e^{tX}] = \frac{2\Big((b-m)e^{ta} + (m-a)e^{tb} - (b-a)e^{tm}\Big)}{(b-a)(b-m)(m-a),t^2}. ]

The characteristic function is (\varphi(\omega)=M(i\omega)).

Entropy#

The differential entropy (in nats) depends only on the interval length: [ H(X)=\frac12 + \log\Big(\frac{b-a}{2}\Big). ]

So entropy does not depend on the mode location (m).

a, m, b = 0.0, 0.2, 1.0

print("mean:", triang_mean(a, m, b))
print("var:", triang_variance(a, m, b))
print("skew:", triang_skewness(a, m, b))
print("excess kurtosis:", triang_excess_kurtosis())
print("entropy:", triang_entropy(a, b))

# MGF/CF demo: recover mean as M'(0) and show |φ(ω)| ≤ 1

t_grid = np.array([-1e-4, 0.0, 1e-4])
mgf_vals = triang_mgf(t_grid, a, m, b)

# Centered finite-difference slope at 0
mean_fd = (mgf_vals[-1] - mgf_vals[0]) / (t_grid[-1] - t_grid[0])
print("mean from finite diff M'(0):", float(mean_fd))

w = np.linspace(0, 60, 300)
phi = triang_cf(w, a, m, b)
print("max |phi(ω)|:", float(np.max(np.abs(phi))))

# Compare to SciPy
c, loc, scale = triang_params_to_scipy(a, m, b)
rv = triang_dist(c, loc=loc, scale=scale)
print("SciPy mvsk:", tuple(float(v) for v in rv.stats(moments="mvsk")))
print("SciPy entropy:", float(rv.entropy()))
mean: 0.39999999999999997
var: 0.04666666666666667
skew: 0.47613605131159786
excess kurtosis: -0.6
entropy: -0.1931471805599453
mean from finite diff M'(0): 0.4010680676452827
max |phi(ω)|: 1.0
SciPy mvsk: (0.39999999999999997, 0.04666666666666667, 0.47613605131159786, -0.6)
SciPy entropy: -0.1931471805599453

5) Parameter Interpretation#

  • (a) (lower bound) shifts the distribution left/right.

  • (b) (upper bound) sets the right endpoint.

  • The scale is (b-a): widening the interval spreads the distribution and increases variance (\propto (b-a)^2).

  • (m) (mode) controls asymmetry:

    • if (m) is close to (a), most mass is near the left endpoint with a long right tail (positive skew)

    • if (m) is close to (b), the distribution is negatively skewed

    • if (m=(a+b)/2), the distribution is symmetric

A nice geometric fact: the peak height (f(m)=2/(b-a)) depends only on the interval length, not on (m). Moving (m) changes the slopes of the two sides.

a, b = 0.0, 1.0
m_values = [0.05, 0.2, 0.5, 0.8, 0.95]

x = np.linspace(a, b, 600)
fig = go.Figure()

for m in m_values:
    fig.add_trace(
        go.Scatter(
            x=x,
            y=triang_pdf(x, a, m, b),
            mode="lines",
            name=f"m={m:.2f}, skew={triang_skewness(a, m, b):+.3f}",
        )
    )

fig.update_layout(title="Triangular PDF for different mode locations (a=0, b=1)", xaxis_title="x", yaxis_title="pdf")
fig.show()

6) Derivations#

Expectation#

The PDF graph is a triangle with vertices ((a,0)), ((m, 2/(b-a))), ((b,0)). The x-coordinate of a triangle’s centroid is the average of the vertices’ x-coordinates, so: [ \mathbb{E}[X] = \frac{a+m+b}{3}. ]

Variance#

Compute the second moment via piecewise integration: [ \mathbb{E}[X^2] = \int_a^m x^2,\frac{2(x-a)}{(b-a)(m-a)},dx

  • \int_m^b x^2,\frac{2(b-x)}{(b-a)(b-m)},dx = \frac{a^2+b^2+m^2+ab+am+bm}{6}. ]

Then [ \mathrm{Var}(X)=\mathbb{E}[X^2]-\mathbb{E}[X]^2 =\frac{a^2+b^2+m^2-ab-am-bm}{18}. ]

Likelihood (i.i.d. sample)#

For observations (x_1,\dots,x_n) inside ((a,b)), the log-likelihood is a sum of two parts depending on whether (x_i\le m) or (x_i>m): [ \ell(a,m,b) = \sum_{i=1}^n \log f(x_i\mid a,m,b). ]

Let (\mathcal{L}={i:x_i\le m}) and (\mathcal{R}={i:x_i>m}). Using the PDF definition: [ \ell(a,m,b)= n\log 2 - n\log(b-a)

  • |\mathcal{L}|\log(m-a) - |\mathcal{R}|\log(b-m)

  • \sum_{i\in\mathcal{L}} \log(x_i-a) + \sum_{i\in\mathcal{R}} \log(b-x_i). ]

Because the partition (\mathcal{L}/\mathcal{R}) changes when (m) crosses a data point, the likelihood is not smooth in (m); derivative-free optimizers are often convenient.

def triang_fit_mle(x: np.ndarray) -> dict:
    '''Simple MLE fit for (a, m, b) using a reparameterization and Nelder–Mead.'''

    x = np.asarray(x, dtype=float)
    if x.ndim != 1:
        raise ValueError("x must be 1D")
    if not np.all(np.isfinite(x)):
        raise ValueError("x must be finite")

    xmin = float(np.min(x))
    xmax = float(np.max(x))
    span = xmax - xmin
    if span <= 0:
        raise ValueError("Need at least two distinct points to fit Triang.")

    # Initial guess: slightly extend beyond the data range to avoid log(0).
    a0 = xmin - 0.05 * span
    b0 = xmax + 0.05 * span
    scale0 = b0 - a0

    mu = float(np.mean(x))
    c0 = float(np.clip((mu - a0) / scale0, 1e-3, 1 - 1e-3))

    # Parameterization:
    # a = a
    # scale = exp(s)
    # c = (tanh(u)+1)/2  in (0,1)
    theta0 = np.array([a0, math.log(scale0), math.atanh(2 * c0 - 1)])

    def unpack(theta: np.ndarray) -> tuple[float, float, float]:
        a, s, u = theta
        scale = float(math.exp(s))
        c = float(0.5 * (math.tanh(u) + 1.0))
        b = a + scale
        m = a + c * scale
        return float(a), float(m), float(b)

    def nll(theta: np.ndarray) -> float:
        a, m, b = unpack(theta)
        ll = triang_loglik(a, m, b, x)
        return np.inf if not np.isfinite(ll) else -ll

    res = optimize.minimize(nll, theta0, method="Nelder-Mead", options={"maxiter": 5000})
    a_hat, m_hat, b_hat = unpack(res.x)
    return {
        "a": a_hat,
        "m": m_hat,
        "b": b_hat,
        "success": bool(res.success),
        "nll": float(res.fun),
        "message": str(res.message),
    }


# Fit demo on synthetic data
true_a, true_m, true_b = 0.0, 0.3, 1.0
x = triang_rvs_numpy(2000, true_a, true_m, true_b, rng=rng)

fit = triang_fit_mle(x)
fit
{'a': 0.004055796177896897,
 'm': 0.29460842560364947,
 'b': 1.0028833997301208,
 'success': True,
 'nll': -377.97189131874006,
 'message': 'Optimization terminated successfully.'}

7) Sampling & Simulation (NumPy-only)#

Inverse transform sampling#

If (U\sim\text{Unif}(0,1)), then (X=F^{-1}(U)) has the desired distribution.

For the triangular CDF, define the split point: [ p = F(m)=\frac{m-a}{b-a}. ]

Then the inverse CDF is: [ F^{-1}(u)= \begin{cases} a + \sqrt{u,(b-a)(m-a)}, & 0\le u < p,\[6pt] b - \sqrt{(1-u),(b-a)(b-m)}, & p\le u\le 1. \end{cases} ]

This is exactly what triang_ppf implements.

a, m, b = 2.0, 3.0, 10.0
n = 200_000
samples = triang_rvs_numpy(n, a, m, b, rng=rng)

print("sample mean vs theory:", float(samples.mean()), triang_mean(a, m, b))
print("sample var  vs theory:", float(samples.var()), triang_variance(a, m, b))
sample mean vs theory: 5.000138145172893 5.0
sample var  vs theory: 3.168342390818645 3.1666666666666665

8) Visualization#

We’ll visualize:

  • the PDF (piecewise linear)

  • the CDF (piecewise quadratic)

  • a Monte Carlo histogram with the theoretical PDF overlaid

a, m, b = 0.0, 0.3, 1.0
c, loc, scale = triang_params_to_scipy(a, m, b)
rv = triang_dist(c, loc=loc, scale=scale)

x = np.linspace(a, b, 700)

# PDF
fig_pdf = go.Figure()
fig_pdf.add_trace(go.Scatter(x=x, y=triang_pdf(x, a, m, b), mode="lines", name="NumPy PDF"))
fig_pdf.add_trace(go.Scatter(x=x, y=rv.pdf(x), mode="lines", name="SciPy PDF", line=dict(dash="dash")))
fig_pdf.update_layout(title="Triangular PDF", xaxis_title="x", yaxis_title="pdf")

# CDF
fig_cdf = go.Figure()
fig_cdf.add_trace(go.Scatter(x=x, y=triang_cdf(x, a, m, b), mode="lines", name="NumPy CDF"))
fig_cdf.add_trace(go.Scatter(x=x, y=rv.cdf(x), mode="lines", name="SciPy CDF", line=dict(dash="dash")))
fig_cdf.update_layout(title="Triangular CDF", xaxis_title="x", yaxis_title="cdf")

# Monte Carlo
n = 60_000
s = triang_rvs_numpy(n, a, m, b, rng=rng)

fig_mc = px.histogram(s, nbins=60, histnorm="probability density", title="Monte Carlo samples")
fig_mc.add_trace(go.Scatter(x=x, y=triang_pdf(x, a, m, b), mode="lines", name="theoretical pdf"))
fig_mc.update_layout(xaxis_title="x", yaxis_title="density")

fig_pdf.show()
fig_cdf.show()
fig_mc.show()

9) SciPy Integration#

scipy.stats.triang implements the triangular family with the standard frozen-distribution API:

  • pdf, cdf, ppf

  • rvs

  • stats, entropy

  • fit (MLE)

Remember SciPy’s shape parameter is not the mode; it’s the normalized mode location (c=(m-a)/(b-a)).

a, m, b = -1.0, 0.2, 2.5
c, loc, scale = triang_params_to_scipy(a, m, b)

rv = triang_dist(c, loc=loc, scale=scale)

x = np.linspace(a, b, 6)
print("x:", x)
print("pdf:", rv.pdf(x))
print("cdf:", rv.cdf(x))
print("rvs(5):", rv.rvs(size=5, random_state=rng))

# Fit example (MLE)
data = rv.rvs(size=3000, random_state=rng)

c_hat, loc_hat, scale_hat = triang_dist.fit(data)
a_hat, m_hat, b_hat = scipy_params_to_triang(c_hat, loc_hat, scale_hat)

print("SciPy fit (c, loc, scale):", (float(c_hat), float(loc_hat), float(scale_hat)))
print("SciPy fit mapped (a, m, b):", (a_hat, m_hat, b_hat))
x: [-1.  -0.3  0.4  1.1  1.8  2.5]
pdf: [0.       0.333333 0.521739 0.347826 0.173913 0.      ]
cdf: [0.       0.116667 0.452174 0.756522 0.93913  1.      ]
rvs(5): [ 0.428577  0.39136   1.017822  1.494647 -0.385795]
SciPy fit (c, loc, scale): (0.3331743756744738, -0.9843826668287806, 3.4809642734577553)
SciPy fit mapped (a, m, b): (-0.9843826668287806, 0.17538543172565524, 2.4965816066289745)

10) Statistical Use Cases#

Hypothesis testing#

If parameters are known a priori, you can test whether data plausibly came from (\text{Triang}(a,m,b)) using a one-sample goodness-of-fit test (e.g. Kolmogorov–Smirnov).

If you estimate parameters from the same data, classical KS p-values are no longer exact (the null is “composite”). In that case, use a parametric bootstrap for calibrated p-values.

Bayesian modeling#

Triangular distributions make convenient bounded priors when you have expert knowledge in (min, mode, max) form.

Generative modeling#

They are useful building blocks in simulation pipelines whenever you need:

  • bounded support

  • a single mode

  • a simple sampling routine

# Hypothesis testing demo (parameters known): KS test
from scipy.stats import kstest

a, m, b = 0.0, 0.3, 1.0
c, loc, scale = triang_params_to_scipy(a, m, b)
rv = triang_dist(c, loc=loc, scale=scale)

x = rv.rvs(size=400, random_state=rng)
stat, pval = kstest(x, rv.cdf)
print("KS statistic:", float(stat))
print("p-value:", float(pval))
KS statistic: 0.03351144747930468
p-value: 0.7469182474457042
# Bayesian modeling demo: triangular prior + normal likelihood (grid posterior)

# Unknown parameter theta, prior Triang(a,m,b)
a, m, b = 0.0, 0.6, 1.0
sigma = 0.12
obs = 0.72

theta = np.linspace(a, b, 2000)
prior = triang_pdf(theta, a, m, b)
lik = norm.pdf(obs, loc=theta, scale=sigma)
post_unnorm = prior * lik
post = post_unnorm / np.trapz(post_unnorm, theta)

# Summaries
post_mean = float(np.trapz(theta * post, theta))
post_cdf = np.cumsum(post) * (theta[1] - theta[0])
post_median = float(theta[np.searchsorted(post_cdf, 0.5)])

print("posterior mean:", post_mean)
print("posterior median:", post_median)

fig = go.Figure()
fig.add_trace(go.Scatter(x=theta, y=prior, mode="lines", name="prior (triangular)"))
fig.add_trace(go.Scatter(x=theta, y=lik / np.trapz(lik, theta), mode="lines", name="likelihood (normalized)", line=dict(dash="dash")))
fig.add_trace(go.Scatter(x=theta, y=post, mode="lines", name="posterior"))
fig.add_vline(x=obs, line_dash="dot", line_color="gray")
fig.update_layout(title="Bayesian update with triangular prior", xaxis_title="theta", yaxis_title="density")
fig.show()
posterior mean: 0.6803733951541852
posterior median: 0.6798399199599799
# Generative modeling demo: bounded task durations and project totals

# Each task duration is modeled with (min, mode, max)
a, m, b = 2.0, 3.0, 10.0
n_projects = 30_000
n_tasks = 6

durations = triang_rvs_numpy(n_projects * n_tasks, a, m, b, rng=rng).reshape(n_projects, n_tasks)
project_total = durations.sum(axis=1)

fig = px.histogram(project_total, nbins=60, title=f"Total duration for {n_tasks} triangular tasks")
fig.update_layout(xaxis_title="total time", yaxis_title="count")
fig.show()

11) Pitfalls#

  • Invalid parameters: you need (a<m<b). Values too close together make denominators tiny.

  • Support is closed but the PDF is 0 at the endpoints: if your data includes exact endpoints, the continuous model assigns probability 0, yielding (-\infty) log-likelihood.

  • SciPy parameterization: scipy.stats.triang(c, loc, scale) uses (c) as a fraction of the interval, not the mode.

  • MLE non-smoothness: the likelihood changes form when (m) crosses a data point; derivative-free optimization is often easier.

  • MGF/CF numerical cancellation: the closed forms divide by (t^2) and can lose precision near 0; a Taylor expansion (as implemented above) stabilizes evaluation.

12) Summary#

  • triang is a continuous, bounded distribution on ([a,b]) with a single mode (m).

  • The PDF is piecewise linear and the CDF is piecewise quadratic.

  • Mean and variance have simple closed forms: (\mathbb{E}[X]=(a+m+b)/3), (\mathrm{Var}(X)=(a^2+b^2+m^2-ab-am-bm)/18).

  • Excess kurtosis is constant (-3/5), and differential entropy depends only on (b-a).

  • Sampling is easy via inverse CDF; SciPy provides a full-featured implementation via scipy.stats.triang.